All Questions

2 votes
0 answers
9 views

Low Accuracy from Geospatial Random Forest ML modeling problem - Training Exported from QGIS, SCP

I am doing a geospatial assessment integrated with ML modeling. The problem is very low accuracy: as the number of training features increases, the accuracy gets lower. What could be the solution to such a ...
Reem
0 votes
0 answers
12 views

Isolation Forest sample size

I am using sklearn's Isolation Forest as a model to detect anomalies. My dataset is relatively small, 50 records with only 2-3 features. To prevent any overfitting, what would you recommend to tune ...
Mar
  • 85
3 votes
1 answer
37 views

Confirm understanding of decision_function in Isolation Forest sklearn

I am looking to better understand sklearn's IsolationForest decision_function. My understanding is that if the metric is closer to -1 then the model is more confident ...
Mar
  • 85
2 votes
0 answers
20 views

Preprocessing multivalue attributes in a dataframe, similar to Nominal

Description: Input is a CSV file. The CSV file contains columns of different data types: Ordinal Values, Nominal Values, Numerical Values and Multi Value. For the multivalue columns: Minimum is 1, ...
DILF Unboxing
2 votes
1 answer
44 views

I can't get my R² above 70%

I tried RandomForest, LGBM, KNeighbors, and Polynomial Regression as algorithms, along with cross-validation, train-test split, and a standard scaler, but nothing seems to get it past the 70% mark. The dataframe has ...
user178825
0 votes
0 answers
25 views

Agglomerative clustering classifies 98% of my data in 1 cluster. Why?

I have a JSD distance matrix that I'm trying to cluster. When generating 24 clusters (roughly the number that shows up on the clustermap), it assigns the vast majority of the data to 1 cluster. Weirdly ...
youtube
1 vote
0 answers
40 views

OneClassSVM super slow training with poly kernel

In contrast to questions like here, where a slow SVM training results from a high number of samples, I only have around 500 samples. Still, a single training fold (cross-validation) takes several ...
UserPo41085
1 vote
1 answer
35 views

scipy bootstrap generates input with inconsistent numbers of samples

I have a dataset of 77 samples, and I am using scipy bootstrap to get a confidence interval to estimate the precision. I am baffled to see that it generates input variables with inconsistent numbers ...
Wouter De Coster
2 votes
1 answer
71 views

Why does the lightgbm .predict function return probabilities not between 0 and 1?

I want to understand why, in this code, I get the following results: ...
Legna
1 vote
1 answer
48 views

Manual Python Implementation of Stacking Model

I tried to build a Python class, CustomStackingClassifier(), to implement the Stacking method in ensemble machine learning. In this implementation, the output of the base classifiers is set to be the ...
CM_Li
3 votes
1 answer
81 views

Comparing clusterings from different datasets

I have 2 different data sets with essentially the same variables, though one is data from one year and the other is data from another year. I've run KModes on both data sets and now have some ...
ethqnol
2 votes
2 answers
228 views

Fitting Rotated Curve

I'm trying to fit a rotated parabola with curve_fit, but it doesn't fit well as shown below: I'm already trying to fit the curve with respect to the cos(𝜃) and ...
Wong Wai Kwun
0 votes
0 answers
37 views

scikit-learn upgrade - how to fix breaking change?

I've inherited a solution that runs in Databricks runtime 7.3, and it uses scikit-learn 0.21. The Databricks runtime must be upgraded, and so the existing scikit-learn version is not compatible with ...
otk
  • 121
0 votes
1 answer
79 views

As an intermediate R programmer looking to dive into machine learning, should I choose Python or stick with R?

Background: I am an intermediate R programmer with some experience in machine learning concepts and simple modeling in R. I have an opportunity to collaborate with a professional machine learning team ...
a.sa.5969
0 votes
0 answers
39 views

Keep training PyTorch model on new data

I'm working on a text classification task and have decided to use a PyTorch model for this purpose. The process mainly involves the following steps: Load and process the text. Use a TF-IDF Vectorizer....
Simon
